See Spot Run: Using Spot Instances for MapReduce Workflows
Abstract
MapReduce is a scalable and fault-tolerant framework, patented by Google, for computing embarrassingly parallel reductions. Hadoop is an open-source implementation of Google MapReduce that is made available as a web service to cloud users by the Amazon Web Services (AWS) cloud computing infrastructure. Amazon Spot Instances (SIs) provide an inexpensive but transient, market-based alternative to purchasing virtualized instances for execution in AWS. In addition to being terminated manually by the user, an SI can be terminated automatically as a function of the market price and the user's maximum bid price. We find that we can significantly improve the runtime of MapReduce jobs in our benchmarks by using SIs as accelerators. However, we also find that SI termination due to budget constraints during the job can have adverse effects on the runtime and may cause the user to overpay for their job. We describe new techniques that help reduce such effects.
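The automatic termination rule mentioned in the abstract can be illustrated with a minimal sketch. It assumes that an instance is terminated in the first hour in which the market price rises strictly above the user's maximum bid. The function name `first_termination_hour` and the price values are hypothetical and are not taken from the paper or from AWS.

```python
from typing import Optional, Sequence


def first_termination_hour(spot_prices: Sequence[float], max_bid: float) -> Optional[int]:
    """Return the index of the first hour whose market price exceeds the bid.

    Assumption: termination happens when price > max_bid (strictly).
    Returns None if the instance survives the whole price trace.
    """
    for hour, price in enumerate(spot_prices):
        if price > max_bid:
            return hour
    return None


# Hypothetical hourly market prices in USD; not real AWS data.
prices = [0.030, 0.032, 0.041, 0.029]

print(first_termination_hour(prices, max_bid=0.035))  # terminated in hour 2
print(first_termination_hour(prices, max_bid=0.050))  # never terminated: None
```

With the lower bid the instance is lost partway through the trace, which is the situation the abstract describes as potentially hurting job runtime.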
Similar resources
Experimental Study of Bidding Strategies for Scientific Workflows using AWS Spot Instances
A spot instance is an auction-based Amazon Elastic Compute Cloud (EC2) instance provided by Amazon Web Services (AWS). It aims to help users reduce their resource renting cost. The price of a spot instance can sometimes be as low as one tenth of the price of an on-demand instance of the same type. However, while gaining significant cost savings on renting resources, users take risks on running ins...
How to Bid the Cloud
Amazon’s Elastic Compute Cloud (EC2) uses auction-based spot pricing to sell spare capacity, allowing users to bid for cloud resources at a highly-reduced rate. Amazon sets this spot price dynamically and accepts user bids above this price. Jobs with lower bids (including those already running) are interrupted and must wait for a lower spot price before resuming. We answer two basic questions f...
Resource provisioning in spot market-based cloud computing environments
Recently, cloud computing providers have started offering unused computational resources in the form of dynamically priced virtual machines (VMs), also known as “spot instances”. In spite of the apparent economical advantage, an intermittent nature is inherent to these biddable resources, which may cause VM unavailability. When an out-of-bid situation occurs, i.e. the current spot price goes ab...
Cutting MapReduce Cost with Spot Market
Spot market provides the ideal mechanism to leverage idle CPU resources and smooth out the computation demands. Unfortunately, few applications can take advantage of spot market because they cannot handle sudden terminations. We describe Spot Cloud MapReduce, the first MapReduce implementation that can fully take advantage of a spot market. Even if a massive number of nodes are terminated regul...
Provisioning Spot Market Cloud Resources to Create Cost-Effective Virtual Clusters
Infrastructure-as-a-Service providers are offering their unused resources in the form of variable-priced virtual machines (VMs), known as “spot instances”, at prices significantly lower than their standard fixed-priced resources. To lease spot instances, users specify a maximum price they are willing to pay per hour and VMs will run only when the current price is lower than the user’s bid. This...